STAT 331

strings

A string is a sequence of characters stored as a single object.

Don’t confuse a string (many characters, one object) with a character vector (vector of strings).


my_string <- "Hi, my name is Bond!"
my_vector <- c("Hi", "my", "name", "is", "Bond")


my_string
[1] "Hi, my name is Bond!"


my_vector
[1] "Hi"   "my"   "name" "is"   "Bond"
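A quick way to check the difference, as a small sketch: length() counts the elements of a vector, while str_length() counts the characters inside each element.

```r
library(stringr)

my_string <- "Hi, my name is Bond!"
my_vector <- c("Hi", "my", "name", "is", "Bond")

# length() counts elements: one string vs. five strings
length(my_string)
length(my_vector)

# str_length() counts the characters in each element
str_length(my_string)
str_length(my_vector)
```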

stringr

Common tasks

  • Find which strings contain a particular pattern

  • Remove or replace a pattern

  • Edit a string (for example, make it lowercase)

Note

The package stringr is very useful for strings!

  • stringr loads with the tidyverse.

  • All of its functions are named str_xxx().

pattern =

The pattern argument in all of the stringr functions …

my_vector <- c("Hello,", "my name is", "Bond", "James Bond")

str_detect(my_vector, pattern = "Bond")
str_locate(my_vector, pattern = "Bond")
str_match(my_vector, pattern = "Bond")
str_extract(my_vector, pattern = "Bond")
str_subset(my_vector, pattern = "Bond")

Note

Discuss with a neighbor. For each of these functions, give:

  • The object structure of the output.
  • The data type of the output.
  • A brief explanation of what they do.

str_detect()

Returns a logical vector (TRUE/FALSE) indicating whether the pattern was found in each element of the original vector.

my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_detect(my_vector, pattern = "Bond")
[1] FALSE FALSE  TRUE  TRUE


  • Pairs well with filter()
  • Could be used with summarise() and sum() or mean()
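A minimal sketch of both pairings, assuming a small hypothetical data set called agents built from the same vector (the data set and its column name are for illustration only):

```r
library(dplyr)
library(stringr)

# Hypothetical one-column data set built from the slide's vector
agents <- tibble(phrase = c("Hello,", "my name is", "Bond", "James Bond"))

# filter() keeps the rows where str_detect() returns TRUE
bond_rows <- agents |>
  filter(str_detect(phrase, pattern = "Bond"))

# TRUE counts as 1, so sum() counts the matches and mean() gives the proportion
bond_summary <- agents |>
  summarise(
    n_bond    = sum(str_detect(phrase, pattern = "Bond")),
    prop_bond = mean(str_detect(phrase, pattern = "Bond"))
  )
```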

Related functions

str_subset() returns just the strings that contain the match

str_which() returns the indexes of strings that have a match
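A short sketch of how the two related functions line up: indexing the original vector with the positions from str_which() gives the same strings as str_subset().

```r
library(stringr)

my_vector <- c("Hello,", "my name is", "Bond", "James Bond")

# Positions of the elements that contain the pattern
match_positions <- str_which(my_vector, pattern = "Bond")

# The matching elements themselves
match_strings <- str_subset(my_vector, pattern = "Bond")

# Both routes lead to the same strings
identical(my_vector[match_positions], match_strings)
```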

str_match()

Returns a character matrix with either NA or the matched text, depending on whether the pattern was found.


my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_match(my_vector, pattern = "Bond")
     [,1]  
[1,] NA    
[2,] NA    
[3,] "Bond"
[4,] "Bond"

str_extract()

Returns a character vector with either NA or the matched text, depending on whether the pattern was found.


my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_extract(my_vector, pattern = "Bond")
[1] NA     NA     "Bond" "Bond"


Warning

str_extract() only returns the first pattern match.

Use str_extract_all() to return every pattern match.
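To see the difference, here is a sketch with a single string that contains the pattern twice (the string intro is made up for illustration). Note that str_extract_all() returns a list with one element per input string.

```r
library(stringr)

intro <- "Bond, James Bond"

# Only the first match comes back
first_match <- str_extract(intro, pattern = "Bond")

# Every match comes back, as a list with one character vector per input string
all_matches <- str_extract_all(intro, pattern = "Bond")
all_matches[[1]]
```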

str_locate()

Returns an integer matrix with two columns, start and end, giving either NA or the starting and ending position of the first match in each string.


my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_locate(my_vector, pattern = "Bond")
     start end
[1,]    NA  NA
[2,]    NA  NA
[3,]     1   4
[4,]     7  10

str_subset()

Returns a character vector containing only the elements of the original vector where the pattern occurs.


my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_subset(my_vector, pattern = "Bond")
[1] "Bond"       "James Bond"


Related Functions

str_sub() extracts values based on location (starting and ending position).
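A brief sketch of str_sub(), using the positions that str_locate() reported for "James Bond" (7 to 10). Negative positions count from the end of the string.

```r
library(stringr)

name <- "James Bond"

# Characters 7 through 10
surname <- str_sub(name, start = 7, end = 10)

# Negative positions count backwards from the end of the string
last_four <- str_sub(name, start = -4)
```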

Replace / Remove patterns

str_replace() replaces the first matched pattern

  • Pairs well with mutate()


str_replace(my_vector, pattern = "Bond", replacement = "Franco")
[1] "Hello,"       "my name is"   "Franco"       "James Franco"

str_remove() removes the first matched pattern

  • You could think of str_remove() as a special case of str_replace(), where the replacement argument is empty ("").
my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_remove(my_vector, pattern = "Bond")
[1] "Hello,"     "my name is" ""           "James "    

Related functions

str_replace_all() replaces all matched patterns

str_remove_all() removes all matched patterns
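The difference only shows up when a string contains the pattern more than once. A small sketch, using a made-up string intro:

```r
library(stringr)

intro <- "Bond, James Bond"

# Only the first "Bond" is replaced
first_only <- str_replace(intro, pattern = "Bond", replacement = "Franco")

# Every "Bond" is replaced
every_one <- str_replace_all(intro, pattern = "Bond", replacement = "Franco")

# Every "Bond" is removed
none_left <- str_remove_all(intro, pattern = "Bond")
```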

Make edits

Convert letters in the string to a specific capitalization format.

str_to_lower() converts all letters in the strings to lowercase


my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_to_lower(my_vector)
[1] "hello,"     "my name is" "bond"       "james bond"

str_to_upper() converts all letters in the strings to uppercase

str_to_upper(my_vector)
[1] "HELLO,"     "MY NAME IS" "BOND"       "JAMES BOND"

str_to_title() converts the first letter of each word to uppercase

str_to_title(my_vector)
[1] "Hello,"     "My Name Is" "Bond"       "James Bond"

Combine Strings

str_c() joins multiple strings into a single string

  • sep argument declares how the strings should be separated when pasting
prompt <- "Hello, my name is"
first  <- "James"
last   <- "Bond"
str_c(prompt, last, ",", first, last, sep = " ")
[1] "Hello, my name is Bond , James Bond"

Note

Similar to paste() and paste0()
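A small sketch of the comparison. One difference worth knowing is how missing values are treated: str_c() returns NA, while paste0() pastes the text "NA".

```r
library(stringr)

# With complete inputs the results agree
same_with_sep <- str_c("James", "Bond", sep = " ") == paste("James", "Bond")
same_no_sep   <- str_c("James", "Bond") == paste0("James", "Bond")

# With a missing value they differ
str_c_na  <- str_c("James", NA)    # NA propagates
paste0_na <- paste0("James", NA)   # NA is converted to the text "NA"
```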

str_flatten() combines a character vector into a single string.

my_vector <- c("Hello,", "my name is", "Bond", "James Bond")
str_flatten(my_vector, collapse = " ")
[1] "Hello, my name is Bond James Bond"

Note

str_c() and glue() work well with mutate() – their output is the same length as their inputs. str_flatten() works well with summarise() – it always returns a single string!
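A sketch of the length difference, using two made-up names: str_c() works element by element, while str_flatten() collapses everything into one string.

```r
library(stringr)

# Illustrative vectors (not from the course data)
first <- c("James", "Eve")
last  <- c("Bond", "Moneypenny")

# One output per input pair, so this fits inside mutate()
full_names <- str_c(first, last, sep = " ")
length(full_names)

# A single string no matter how many inputs, so this fits inside summarise()
roster <- str_flatten(full_names, collapse = ", ")
length(roster)
```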

str_glue() uses the environment to create a string and evaluates {expressions}.

first <- "James"
last <- "Bond"
str_glue("My name is {last}, {first} {last}")
My name is Bond, James Bond

Tip

We will use the glue package a lot in Week 7!

Tips for Success

  • Refer to the stringr cheatsheet

  • Remember that str_xxx functions need the first argument to be a vector of strings, not a data set.

    • You might want to use them inside functions like filter() or mutate().
cereal |> 
  mutate(
    is_bran = str_detect(name, "Bran"), 
    .after = name
  )
                                     name is_bran manuf type calories protein
1                               100% Bran    TRUE     N cold       70       4
2                       100% Natural Bran    TRUE     Q cold      120       3
3                                All-Bran    TRUE     K cold       70       4
4               All-Bran with Extra Fiber    TRUE     K cold       50       4
5                          Almond Delight   FALSE     R cold      110       2
6                 Apple Cinnamon Cheerios   FALSE     G cold      110       2
7                             Apple Jacks   FALSE     K cold      110       2
8                                 Basic 4   FALSE     G cold      130       3
9                               Bran Chex    TRUE     R cold       90       2
10                            Bran Flakes    TRUE     P cold       90       3
11                           Cap'n'Crunch   FALSE     Q cold      120       1
12                               Cheerios   FALSE     G cold      110       6
13                  Cinnamon Toast Crunch   FALSE     G cold      120       1
14                               Clusters   FALSE     G cold      110       3
15                            Cocoa Puffs   FALSE     G cold      110       1
16                              Corn Chex   FALSE     R cold      110       2
17                            Corn Flakes   FALSE     K cold      100       2
18                              Corn Pops   FALSE     K cold      110       1
19                          Count Chocula   FALSE     G cold      110       1
20                     Cracklin' Oat Bran    TRUE     K cold      110       3
21                 Cream of Wheat (Quick)   FALSE     N  hot      100       3
22                                Crispix   FALSE     K cold      110       2
23                 Crispy Wheat & Raisins   FALSE     G cold      100       2
24                            Double Chex   FALSE     R cold      100       2
25                            Froot Loops   FALSE     K cold      110       2
26                         Frosted Flakes   FALSE     K cold      110       1
27                    Frosted Mini-Wheats   FALSE     K cold      100       3
28 Fruit & Fibre Dates; Walnuts; and Oats   FALSE     P cold      120       3
29                          Fruitful Bran    TRUE     K cold      120       3
30                         Fruity Pebbles   FALSE     P cold      110       1
31                           Golden Crisp   FALSE     P cold      100       2
32                         Golden Grahams   FALSE     G cold      110       1
33                      Grape Nuts Flakes   FALSE     P cold      100       3
34                             Grape-Nuts   FALSE     P cold      110       3
35                     Great Grains Pecan   FALSE     P cold      120       3
36                       Honey Graham Ohs   FALSE     Q cold      120       1
37                     Honey Nut Cheerios   FALSE     G cold      110       3
38                             Honey-comb   FALSE     P cold      110       1
39            Just Right Crunchy  Nuggets   FALSE     K cold      110       2
40                 Just Right Fruit & Nut   FALSE     K cold      140       3
41                                    Kix   FALSE     G cold      110       2
42                                   Life   FALSE     Q cold      100       4
43                           Lucky Charms   FALSE     G cold      110       2
44                                  Maypo   FALSE     A  hot      100       4
45       Muesli Raisins; Dates; & Almonds   FALSE     R cold      150       4
46      Muesli Raisins; Peaches; & Pecans   FALSE     R cold      150       4
47                   Mueslix Crispy Blend   FALSE     K cold      160       3
48                   Multi-Grain Cheerios   FALSE     G cold      100       2
49                       Nut&Honey Crunch   FALSE     K cold      120       2
50              Nutri-Grain Almond-Raisin   FALSE     K cold      140       3
51                      Nutri-grain Wheat   FALSE     K cold       90       3
52                   Oatmeal Raisin Crisp   FALSE     G cold      130       3
53                  Post Nat. Raisin Bran    TRUE     P cold      120       3
54                             Product 19   FALSE     K cold      100       3
55                            Puffed Rice   FALSE     Q cold       50       1
56                           Puffed Wheat   FALSE     Q cold       50       2
57                     Quaker Oat Squares   FALSE     Q cold      100       4
58                         Quaker Oatmeal   FALSE     Q  hot      100       5
59                            Raisin Bran    TRUE     K cold      120       3
60                        Raisin Nut Bran    TRUE     G cold      100       3
61                         Raisin Squares   FALSE     K cold       90       2
62                              Rice Chex   FALSE     R cold      110       1
63                          Rice Krispies   FALSE     K cold      110       2
64                         Shredded Wheat   FALSE     N cold       80       2
65                 Shredded Wheat 'n'Bran    TRUE     N cold       90       3
66              Shredded Wheat spoon size   FALSE     N cold       90       3
67                                 Smacks   FALSE     K cold      110       2
68                              Special K   FALSE     K cold      110       6
69                Strawberry Fruit Wheats   FALSE     N cold       90       2
70                      Total Corn Flakes   FALSE     G cold      110       2
71                      Total Raisin Bran    TRUE     G cold      140       3
72                      Total Whole Grain   FALSE     G cold      100       3
73                                Triples   FALSE     G cold      110       2
74                                   Trix   FALSE     G cold      110       1
75                             Wheat Chex   FALSE     R cold      100       3
76                               Wheaties   FALSE     G cold      100       3
77                    Wheaties Honey Gold   FALSE     G cold      110       2
   fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
1    1    130  10.0   5.0      6    280       25     3   1.00 0.33 68.40297
2    5     15   2.0   8.0      8    135        0     3   1.00 1.00 33.98368
3    1    260   9.0   7.0      5    320       25     3   1.00 0.33 59.42551
4    0    140  14.0   8.0      0    330       25     3   1.00 0.50 93.70491
5    2    200   1.0  14.0      8     -1       25     3   1.00 0.75 34.38484
6    2    180   1.5  10.5     10     70       25     1   1.00 0.75 29.50954
7    0    125   1.0  11.0     14     30       25     2   1.00 1.00 33.17409
8    2    210   2.0  18.0      8    100       25     3   1.33 0.75 37.03856
9    1    200   4.0  15.0      6    125       25     1   1.00 0.67 49.12025
10   0    210   5.0  13.0      5    190       25     3   1.00 0.67 53.31381
11   2    220   0.0  12.0     12     35       25     2   1.00 0.75 18.04285
12   2    290   2.0  17.0      1    105       25     1   1.00 1.25 50.76500
13   3    210   0.0  13.0      9     45       25     2   1.00 0.75 19.82357
14   2    140   2.0  13.0      7    105       25     3   1.00 0.50 40.40021
15   1    180   0.0  12.0     13     55       25     2   1.00 1.00 22.73645
16   0    280   0.0  22.0      3     25       25     1   1.00 1.00 41.44502
17   0    290   1.0  21.0      2     35       25     1   1.00 1.00 45.86332
18   0     90   1.0  13.0     12     20       25     2   1.00 1.00 35.78279
19   1    180   0.0  12.0     13     65       25     2   1.00 1.00 22.39651
20   3    140   4.0  10.0      7    160       25     3   1.00 0.50 40.44877
21   0     80   1.0  21.0      0     -1        0     2   1.00 1.00 64.53382
22   0    220   1.0  21.0      3     30       25     3   1.00 1.00 46.89564
23   1    140   2.0  11.0     10    120       25     3   1.00 0.75 36.17620
24   0    190   1.0  18.0      5     80       25     3   1.00 0.75 44.33086
25   1    125   1.0  11.0     13     30       25     2   1.00 1.00 32.20758
26   0    200   1.0  14.0     11     25       25     1   1.00 0.75 31.43597
27   0      0   3.0  14.0      7    100       25     2   1.00 0.80 58.34514
28   2    160   5.0  12.0     10    200       25     3   1.25 0.67 40.91705
29   0    240   5.0  14.0     12    190       25     3   1.33 0.67 41.01549
30   1    135   0.0  13.0     12     25       25     2   1.00 0.75 28.02576
31   0     45   0.0  11.0     15     40       25     1   1.00 0.88 35.25244
32   1    280   0.0  15.0      9     45       25     2   1.00 0.75 23.80404
33   1    140   3.0  15.0      5     85       25     3   1.00 0.88 52.07690
34   0    170   3.0  17.0      3     90       25     3   1.00 0.25 53.37101
35   3     75   3.0  13.0      4    100       25     3   1.00 0.33 45.81172
36   2    220   1.0  12.0     11     45       25     2   1.00 1.00 21.87129
37   1    250   1.5  11.5     10     90       25     1   1.00 0.75 31.07222
38   0    180   0.0  14.0     11     35       25     1   1.00 1.33 28.74241
39   1    170   1.0  17.0      6     60      100     3   1.00 1.00 36.52368
40   1    170   2.0  20.0      9     95      100     3   1.30 0.75 36.47151
41   1    260   0.0  21.0      3     40       25     2   1.00 1.50 39.24111
42   2    150   2.0  12.0      6     95       25     2   1.00 0.67 45.32807
43   1    180   0.0  12.0     12     55       25     2   1.00 1.00 26.73451
44   1      0   0.0  16.0      3     95       25     2   1.00 1.00 54.85092
45   3     95   3.0  16.0     11    170       25     3   1.00 1.00 37.13686
46   3    150   3.0  16.0     11    170       25     3   1.00 1.00 34.13976
47   2    150   3.0  17.0     13    160       25     3   1.50 0.67 30.31335
48   1    220   2.0  15.0      6     90       25     1   1.00 1.00 40.10596
49   1    190   0.0  15.0      9     40       25     2   1.00 0.67 29.92429
50   2    220   3.0  21.0      7    130       25     3   1.33 0.67 40.69232
51   0    170   3.0  18.0      2     90       25     3   1.00 1.00 59.64284
52   2    170   1.5  13.5     10    120       25     3   1.25 0.50 30.45084
53   1    200   6.0  11.0     14    260       25     3   1.33 0.67 37.84059
54   0    320   1.0  20.0      3     45      100     3   1.00 1.00 41.50354
55   0      0   0.0  13.0      0     15        0     3   0.50 1.00 60.75611
56   0      0   1.0  10.0      0     50        0     3   0.50 1.00 63.00565
57   1    135   2.0  14.0      6    110       25     3   1.00 0.50 49.51187
58   2      0   2.7  -1.0     -1    110        0     1   1.00 0.67 50.82839
59   1    210   5.0  14.0     12    240       25     2   1.33 0.75 39.25920
60   2    140   2.5  10.5      8    140       25     3   1.00 0.50 39.70340
61   0      0   2.0  15.0      6    110       25     3   1.00 0.50 55.33314
62   0    240   0.0  23.0      2     30       25     1   1.00 1.13 41.99893
63   0    290   0.0  22.0      3     35       25     1   1.00 1.00 40.56016
64   0      0   3.0  16.0      0     95        0     1   0.83 1.00 68.23588
65   0      0   4.0  19.0      0    140        0     1   1.00 0.67 74.47295
66   0      0   3.0  20.0      0    120        0     1   1.00 0.67 72.80179
67   1     70   1.0   9.0     15     40       25     2   1.00 0.75 31.23005
68   0    230   1.0  16.0      3     55       25     1   1.00 1.00 53.13132
69   0     15   3.0  15.0      5     90       25     2   1.00 1.00 59.36399
70   1    200   0.0  21.0      3     35      100     3   1.00 1.00 38.83975
71   1    190   4.0  15.0     14    230      100     3   1.50 1.00 28.59278
72   1    200   3.0  16.0      3    110      100     3   1.00 1.00 46.65884
73   1    250   0.0  21.0      3     60       25     3   1.00 0.75 39.10617
74   1    140   0.0  13.0     12     25       25     2   1.00 1.00 27.75330
75   1    230   3.0  17.0      3    115       25     1   1.00 0.67 49.78744
76   1    200   3.0  17.0      3    110       25     1   1.00 1.00 51.59219
77   1    200   1.0  16.0      8     60       25     1   1.00 0.75 36.18756

regex

Regular Expressions




“Regexps are a very terse language that allow you to describe patterns in strings.”

R for Data Science

R uses “extended” regular expressions, which are common.

str_detect(string  = my_string_vector, 
           pattern = "REGULAR EXPRESSION"
           )

Tip

Regular expressions are a reason to use stringr!

You might encounter gsub(), grep(), etc. from Base R.

Meta Characters . ^ $ \ | * + ? { } [ ] ( )

toung_twister <- c("She", "sells", "seashells", "by", "the", "seashore!")

. Represents any character

str_subset(toung_twister, pattern = ".ells")
[1] "sells"     "seashells"


^ Looks at the beginning

str_subset(toung_twister, 
           pattern = "^s")
[1] "sells"     "seashells" "seashore!"

$ Looks at the end

str_subset(toung_twister, 
           pattern = "s$")
[1] "sells"     "seashells"
shells_str <- c("shes", "shels", "shells", "shellls", "shelllls")

? Occurs 0 or 1 times

str_subset(shells_str, 
           pattern = "shel?s")
[1] "shes"  "shels"

+ Occurs 1 or more times

str_subset(shells_str, 
           pattern = "shel+s")
[1] "shels"    "shells"   "shellls"  "shelllls"

* Occurs 0 or more times

str_subset(shells_str, 
           pattern = "shel*s")
[1] "shes"     "shels"    "shells"   "shellls"  "shelllls"

{n} matches exactly n times.

str_subset(shells_str, 
           pattern = "shel{2}s")
[1] "shells"

{n,} matches at least n times.

str_subset(shells_str, 
           pattern = "shel{2,}s")
[1] "shells"   "shellls"  "shelllls"

{n,m} matches between n and m times.

str_subset(shells_str, 
           pattern = "shel{1,3}s")
[1] "shels"   "shells"  "shellls"

Groups ()

  • Similar to our filter() statements, | means “or”
toung_twister2 <- c("Peter", "Piper", "picked", "a",
                    "peck", "of", "pickled", "peppers!")
str_subset(toung_twister2, 
           pattern = "p(e|i)ck")
[1] "picked"  "peck"    "pickled"

Tip

A lot of the benefit of groups is the ability to back reference positions of a string.
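A minimal sketch of back references. Inside an R string each one is written with a doubled backslash: "\\1" refers to whatever the first group matched, "\\2" to the second.

```r
library(stringr)

# In the replacement, \\2 and \\1 reuse the text captured by each group
swapped <- str_replace(
  "James Bond",
  pattern     = "(\\w+) (\\w+)",
  replacement = "\\2, \\1"
)

# In the pattern, \\1 must match the same text the first group just matched,
# so this finds words with a doubled lowercase letter
doubled <- str_subset(
  c("Peter", "picked", "peppers!"),
  pattern = "([a-z])\\1"
)
```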

Character Classes

are defined by [] and let you match a set of characters

str_subset(toung_twister2, 
           pattern = "p[ei]ck")
[1] "picked"  "peck"    "pickled"

Note

Notice how you could have accomplished the same thing with pattern = "p(e|i)ck"?

More Character Classes!

[^ ] except - think “not”

str_subset(toung_twister2, pattern = "p[^i]ck")
[1] "peck"

[ - ] range

str_subset(toung_twister2, pattern = "p[ei]ck[a-z]")
[1] "picked"  "pickled"

[Pp] Capitalization matters

str_subset(toung_twister2, pattern = "^p")
[1] "picked"   "peck"     "pickled"  "peppers!"
str_subset(toung_twister2, pattern = "^[Pp]")
[1] "Peter"    "Piper"    "picked"   "peck"     "pickled"  "peppers!"
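Another option, sketched here, is to wrap the pattern in stringr's regex() helper and ask it to ignore case. This gives the same result as "^[Pp]".

```r
library(stringr)

toung_twister2 <- c("Peter", "Piper", "picked", "a",
                    "peck", "of", "pickled", "peppers!")

# ignore_case = TRUE lets "p" match both "p" and "P"
either_case <- str_subset(
  toung_twister2,
  pattern = regex("^p", ignore_case = TRUE)
)
```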

Even More Character Classes

  • [A-Z] matches any capital letter.
  • [a-z] matches any lowercase letter.
  • [A-Za-z] or [:alpha:] matches any letter ([A-z] also matches the punctuation characters that sit between Z and a)
  • [0-9] or [:digit:] matches any digit
  • See the stringr cheatsheet for more shortcuts, like [:punct:]
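A small sketch of why the range matters. The range A-z runs through every character between "A" and "z", which includes symbols such as "_" and "^". The test vector symbols is made up for illustration, and the named classes are written in the double-bracket form here.

```r
library(stringr)

symbols <- c("a", "Z", "_", "^", "7")

# A-z also covers the punctuation between "Z" and "a"
wide_range <- str_detect(symbols, "[A-z]")

# Two separate ranges match letters only
letters_only <- str_detect(symbols, "[A-Za-z]")

# Named classes for letters and digits
alpha_class <- str_detect(symbols, "[[:alpha:]]")
digit_class <- str_detect(symbols, "[[:digit:]]")
```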

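\w Looks for any “word” character: a letter, digit, or underscore (conversely, \W is “not” a word character)

\d Looks for any digit (conversely, \D is “not” a digit)

\s Looks for any whitespace (conversely, \S is “not” whitespace)

Inside an R string, each of these is written with a doubled backslash, for example "\\d".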
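A short sketch of the three shortcuts in use. The string badge is made up for illustration, and each backslash is doubled because the pattern is written as an R string.

```r
library(stringr)

badge <- "Agent 007 reporting"

# One or more digits
digits <- str_extract(badge, pattern = "\\d+")

# Every run of word characters
words <- str_extract_all(badge, pattern = "\\w+")[[1]]

# Replace each whitespace character with an underscore
no_space <- str_replace_all(badge, pattern = "\\s", replacement = "_")
```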

Let’s try it out!

Discuss with a neighbor which regular expressions would search for words that do the following:

  • end with a vowel
  • start with x, y, or z
  • do not contain x, y, or z

Escape \

To match a special character literally, you need to “escape” it with a backslash first. Inside an R string, that backslash is itself written as two backslashes.

Warning

In general, look at punctuation characters with suspicion.

toung_twister3 <- c("How", "much", "wood", "could", 
                    "a", "woodchuck", "chuck", "if", "a", 
                    "woodchuck", "could", "chuck", "wood?")
str_subset(toung_twister3, pattern = "?")
Error in stri_subset_regex(string, pattern, omit_na = TRUE, negate = negate, : Syntax error in regex pattern. (U_REGEX_RULE_SYNTAX, context=`?`)


str_subset(toung_twister3, pattern = "\\?")
[1] "wood?"
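A sketch of two ways to match punctuation literally: escaping with a doubled backslash, or wrapping the pattern in stringr's fixed() helper so that nothing in it is treated as a metacharacter. The vector files is made up for illustration.

```r
library(stringr)

files <- c("a.b", "acb")

# Unescaped, "." matches any character, so both strings match
any_char <- str_subset(files, pattern = "a.b")

# Escaped, "\\." matches only a literal period
escaped <- str_subset(files, pattern = "a\\.b")

# fixed() treats the whole pattern as plain text
literal_dot  <- str_subset(files, pattern = fixed("a.b"))
literal_mark <- str_subset(c("wood", "wood?"), pattern = fixed("?"))
```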

When in Doubt



Use this web app to test R regular expressions

Tips for working with regex

  • Read the regular expressions out loud like a “request”

str_view() and str_view_all()

shells_str
[1] "shes"     "shels"    "shells"   "shellls"  "shelllls"
str_view(shells_str, "l+")
str_view_all(shells_str, "l+")

Tips for working with regex

  • Everyone has a love-hate relationship with regular expressions. Be kind to yourself.

strings in the tidyverse

matches(pattern)

Selects all variables with a name that matches the supplied pattern

  • pairs well with select(), rename_with(), and across()
military_clean <- military |> 
  mutate(across(`1988`:`2019`, 
                ~ na_if(.x, y = ". .")
                ),
         across(`1988`:`2019`, 
                ~ na_if(.x, y = "xxx")
                )
         )
# The same cleaning, selecting the year columns with matches()
military_clean <- military |> 
  mutate(
         across(matches("[0-9]{4}"), 
                ~ na_if(.x, y = ". .")
                ),
         across(matches("[0-9]{4}"), 
                ~ na_if(.x, y = "xxx")
                )
         )
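A small sketch of how matches() picks columns, using a hypothetical stand-in for the military data (the column names and values below are invented for illustration). A four-digit year pattern needs [0-9], since [1-9]{4} skips any year that contains a zero.

```r
library(dplyr)

# Hypothetical stand-in: one label column and three year columns
military_small <- tibble(
  Country = c("A", "B"),
  `1988`  = c("1.2", ". ."),
  `2000`  = c("xxx", "3.4"),
  `2019`  = c("5.6", "7.8")
)

# All three year columns are selected
year_cols <- military_small |>
  select(matches("[0-9]{4}")) |>
  names()

# Only 1988 is selected: 2000 and 2019 both contain a zero
nonzero_cols <- military_small |>
  select(matches("[1-9]{4}")) |>
  names()

# Both placeholder codes become NA in every year column
military_small_clean <- military_small |>
  mutate(
    across(matches("[0-9]{4}"), ~ na_if(.x, ". .")),
    across(matches("[0-9]{4}"), ~ na_if(.x, "xxx"))
  )
```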

“Messy” Covid Variants

A friend of mine received a dataset the other day from someone who asked if she knew how to “clean” it.

What is that column?! 😮

[{'variant': 'Other', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 4.59}, {'variant': 'V-20DEC-01 (Alpha)', 'cumWeeklySequenced': 0.0, 'newWeeklyPercentage': 0.0}, {'variant': 'V-21APR-02 (Delta B.1.617.2)', 'cumWeeklySequenced': 0.0, 'newWeeklyPercentage': 0.0}, {'variant': 'V-21OCT-01 (Delta AY 4.2)', 'cumWeeklySequenced': 0.0, 'newWeeklyPercentage': 0.0}, {'variant': 'V-22DEC-01 (Omicron CH.1.1)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 24.56}, {'variant': 'V-22JUL-01 (Omicron BA.2.75)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 8.93}, {'variant': 'V-22OCT-01 (Omicron BQ.1)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 49.57}, {'variant': 'VOC-21NOV-01 (Omicron BA.1)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 0.02}, {'variant': 'VOC-22APR-03 (Omicron BA.4)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 0.08}, {'variant': 'VOC-22APR-04 (Omicron BA.5)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 5.59}, {'variant': 'VOC-22JAN-01 (Omicron BA.2)', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 1.41}, {'variant': 'unclassified_variant', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 5.26}]

Enter stringr! 🎂

Let’s see how this works.
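One possible starting point, as a sketch rather than the approach used in class: pull the pieces out with capture groups and str_match_all(). The string messy below is a two-entry excerpt of the column shown above, and the object names are chosen for illustration.

```r
library(stringr)

# Two-entry excerpt of the messy column
messy <- "[{'variant': 'Other', 'cumWeeklySequenced': 2366843.0, 'newWeeklyPercentage': 4.59}, {'variant': 'V-20DEC-01 (Alpha)', 'cumWeeklySequenced': 0.0, 'newWeeklyPercentage': 0.0}]"

# str_match_all() returns one matrix per input string:
# column 1 holds the full match, column 2 holds the first capture group
variant_names <- str_match_all(messy, "'variant': '([^']+)'")[[1]][, 2]

# Capture the number after each 'newWeeklyPercentage' key and convert it
weekly_pct <- str_match_all(messy, "'newWeeklyPercentage': ([0-9.]+)")[[1]][, 2] |>
  as.numeric()
```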

PA 5.2: Scrambled Message

In this activity, you will be using regular expressions to decode a message.

For the activity, you are able to choose what type of object you want to work with – a vector or a dataframe.

Dataframes

If you choose to work with dataframes, remember the stringr functions go inside dplyr verbs like mutate() and filter(). Think of them as you would as.factor().


cereal |> 
  mutate(
    is_bran = str_detect(name, "Bran")
  )
                                     name manuf type calories protein fat
1                               100% Bran     N cold       70       4   1
2                       100% Natural Bran     Q cold      120       3   5
3                                All-Bran     K cold       70       4   1
4               All-Bran with Extra Fiber     K cold       50       4   0
5                          Almond Delight     R cold      110       2   2
6                 Apple Cinnamon Cheerios     G cold      110       2   2
7                             Apple Jacks     K cold      110       2   0
8                                 Basic 4     G cold      130       3   2
9                               Bran Chex     R cold       90       2   1
10                            Bran Flakes     P cold       90       3   0
11                           Cap'n'Crunch     Q cold      120       1   2
12                               Cheerios     G cold      110       6   2
13                  Cinnamon Toast Crunch     G cold      120       1   3
14                               Clusters     G cold      110       3   2
15                            Cocoa Puffs     G cold      110       1   1
16                              Corn Chex     R cold      110       2   0
17                            Corn Flakes     K cold      100       2   0
18                              Corn Pops     K cold      110       1   0
19                          Count Chocula     G cold      110       1   1
20                     Cracklin' Oat Bran     K cold      110       3   3
21                 Cream of Wheat (Quick)     N  hot      100       3   0
22                                Crispix     K cold      110       2   0
23                 Crispy Wheat & Raisins     G cold      100       2   1
24                            Double Chex     R cold      100       2   0
25                            Froot Loops     K cold      110       2   1
26                         Frosted Flakes     K cold      110       1   0
27                    Frosted Mini-Wheats     K cold      100       3   0
28 Fruit & Fibre Dates; Walnuts; and Oats     P cold      120       3   2
29                          Fruitful Bran     K cold      120       3   0
30                         Fruity Pebbles     P cold      110       1   1
31                           Golden Crisp     P cold      100       2   0
32                         Golden Grahams     G cold      110       1   1
33                      Grape Nuts Flakes     P cold      100       3   1
34                             Grape-Nuts     P cold      110       3   0
35                     Great Grains Pecan     P cold      120       3   3
36                       Honey Graham Ohs     Q cold      120       1   2
37                     Honey Nut Cheerios     G cold      110       3   1
38                             Honey-comb     P cold      110       1   0
39            Just Right Crunchy  Nuggets     K cold      110       2   1
40                 Just Right Fruit & Nut     K cold      140       3   1
41                                    Kix     G cold      110       2   1
42                                   Life     Q cold      100       4   2
43                           Lucky Charms     G cold      110       2   1
44                                  Maypo     A  hot      100       4   1
45       Muesli Raisins; Dates; & Almonds     R cold      150       4   3
46      Muesli Raisins; Peaches; & Pecans     R cold      150       4   3
47                   Mueslix Crispy Blend     K cold      160       3   2
48                   Multi-Grain Cheerios     G cold      100       2   1
49                       Nut&Honey Crunch     K cold      120       2   1
50              Nutri-Grain Almond-Raisin     K cold      140       3   2
51                      Nutri-grain Wheat     K cold       90       3   0
52                   Oatmeal Raisin Crisp     G cold      130       3   2
53                  Post Nat. Raisin Bran     P cold      120       3   1
54                             Product 19     K cold      100       3   0
55                            Puffed Rice     Q cold       50       1   0
56                           Puffed Wheat     Q cold       50       2   0
57                     Quaker Oat Squares     Q cold      100       4   1
58                         Quaker Oatmeal     Q  hot      100       5   2
59                            Raisin Bran     K cold      120       3   1
60                        Raisin Nut Bran     G cold      100       3   2
61                         Raisin Squares     K cold       90       2   0
62                              Rice Chex     R cold      110       1   0
63                          Rice Krispies     K cold      110       2   0
64                         Shredded Wheat     N cold       80       2   0
65                 Shredded Wheat 'n'Bran     N cold       90       3   0
66              Shredded Wheat spoon size     N cold       90       3   0
67                                 Smacks     K cold      110       2   1
68                              Special K     K cold      110       6   0
69                Strawberry Fruit Wheats     N cold       90       2   0
70                      Total Corn Flakes     G cold      110       2   1
71                      Total Raisin Bran     G cold      140       3   1
72                      Total Whole Grain     G cold      100       3   1
73                                Triples     G cold      110       2   1
74                                   Trix     G cold      110       1   1
75                             Wheat Chex     R cold      100       3   1
76                               Wheaties     G cold      100       3   1
77                    Wheaties Honey Gold     G cold      110       2   1
   sodium fiber carbo sugars potass vitamins shelf weight cups   rating is_bran
1     130  10.0   5.0      6    280       25     3   1.00 0.33 68.40297    TRUE
2      15   2.0   8.0      8    135        0     3   1.00 1.00 33.98368    TRUE
3     260   9.0   7.0      5    320       25     3   1.00 0.33 59.42551    TRUE
4     140  14.0   8.0      0    330       25     3   1.00 0.50 93.70491    TRUE
5     200   1.0  14.0      8     -1       25     3   1.00 0.75 34.38484   FALSE
6     180   1.5  10.5     10     70       25     1   1.00 0.75 29.50954   FALSE
7     125   1.0  11.0     14     30       25     2   1.00 1.00 33.17409   FALSE
8     210   2.0  18.0      8    100       25     3   1.33 0.75 37.03856   FALSE
9     200   4.0  15.0      6    125       25     1   1.00 0.67 49.12025    TRUE
10    210   5.0  13.0      5    190       25     3   1.00 0.67 53.31381    TRUE
11    220   0.0  12.0     12     35       25     2   1.00 0.75 18.04285   FALSE
12    290   2.0  17.0      1    105       25     1   1.00 1.25 50.76500   FALSE
13    210   0.0  13.0      9     45       25     2   1.00 0.75 19.82357   FALSE
14    140   2.0  13.0      7    105       25     3   1.00 0.50 40.40021   FALSE
15    180   0.0  12.0     13     55       25     2   1.00 1.00 22.73645   FALSE
16    280   0.0  22.0      3     25       25     1   1.00 1.00 41.44502   FALSE
17    290   1.0  21.0      2     35       25     1   1.00 1.00 45.86332   FALSE
18     90   1.0  13.0     12     20       25     2   1.00 1.00 35.78279   FALSE
19    180   0.0  12.0     13     65       25     2   1.00 1.00 22.39651   FALSE
20    140   4.0  10.0      7    160       25     3   1.00 0.50 40.44877    TRUE
21     80   1.0  21.0      0     -1        0     2   1.00 1.00 64.53382   FALSE
22    220   1.0  21.0      3     30       25     3   1.00 1.00 46.89564   FALSE
23    140   2.0  11.0     10    120       25     3   1.00 0.75 36.17620   FALSE
24    190   1.0  18.0      5     80       25     3   1.00 0.75 44.33086   FALSE
25    125   1.0  11.0     13     30       25     2   1.00 1.00 32.20758   FALSE
26    200   1.0  14.0     11     25       25     1   1.00 0.75 31.43597   FALSE
27      0   3.0  14.0      7    100       25     2   1.00 0.80 58.34514   FALSE
28    160   5.0  12.0     10    200       25     3   1.25 0.67 40.91705   FALSE
29    240   5.0  14.0     12    190       25     3   1.33 0.67 41.01549    TRUE
30    135   0.0  13.0     12     25       25     2   1.00 0.75 28.02576   FALSE
31     45   0.0  11.0     15     40       25     1   1.00 0.88 35.25244   FALSE
32    280   0.0  15.0      9     45       25     2   1.00 0.75 23.80404   FALSE
33    140   3.0  15.0      5     85       25     3   1.00 0.88 52.07690   FALSE
34    170   3.0  17.0      3     90       25     3   1.00 0.25 53.37101   FALSE
35     75   3.0  13.0      4    100       25     3   1.00 0.33 45.81172   FALSE
36    220   1.0  12.0     11     45       25     2   1.00 1.00 21.87129   FALSE
37    250   1.5  11.5     10     90       25     1   1.00 0.75 31.07222   FALSE
38    180   0.0  14.0     11     35       25     1   1.00 1.33 28.74241   FALSE
39    170   1.0  17.0      6     60      100     3   1.00 1.00 36.52368   FALSE
40    170   2.0  20.0      9     95      100     3   1.30 0.75 36.47151   FALSE
41    260   0.0  21.0      3     40       25     2   1.00 1.50 39.24111   FALSE
42    150   2.0  12.0      6     95       25     2   1.00 0.67 45.32807   FALSE
43    180   0.0  12.0     12     55       25     2   1.00 1.00 26.73451   FALSE
44      0   0.0  16.0      3     95       25     2   1.00 1.00 54.85092   FALSE
45     95   3.0  16.0     11    170       25     3   1.00 1.00 37.13686   FALSE
46    150   3.0  16.0     11    170       25     3   1.00 1.00 34.13976   FALSE
47    150   3.0  17.0     13    160       25     3   1.50 0.67 30.31335   FALSE
48    220   2.0  15.0      6     90       25     1   1.00 1.00 40.10596   FALSE
49    190   0.0  15.0      9     40       25     2   1.00 0.67 29.92429   FALSE
50    220   3.0  21.0      7    130       25     3   1.33 0.67 40.69232   FALSE
51    170   3.0  18.0      2     90       25     3   1.00 1.00 59.64284   FALSE
52    170   1.5  13.5     10    120       25     3   1.25 0.50 30.45084   FALSE
53    200   6.0  11.0     14    260       25     3   1.33 0.67 37.84059    TRUE
54    320   1.0  20.0      3     45      100     3   1.00 1.00 41.50354   FALSE
55      0   0.0  13.0      0     15        0     3   0.50 1.00 60.75611   FALSE
56      0   1.0  10.0      0     50        0     3   0.50 1.00 63.00565   FALSE
57    135   2.0  14.0      6    110       25     3   1.00 0.50 49.51187   FALSE
58      0   2.7  -1.0     -1    110        0     1   1.00 0.67 50.82839   FALSE
59    210   5.0  14.0     12    240       25     2   1.33 0.75 39.25920    TRUE
60    140   2.5  10.5      8    140       25     3   1.00 0.50 39.70340    TRUE
61      0   2.0  15.0      6    110       25     3   1.00 0.50 55.33314   FALSE
62    240   0.0  23.0      2     30       25     1   1.00 1.13 41.99893   FALSE
63    290   0.0  22.0      3     35       25     1   1.00 1.00 40.56016   FALSE
64      0   3.0  16.0      0     95        0     1   0.83 1.00 68.23588   FALSE
65      0   4.0  19.0      0    140        0     1   1.00 0.67 74.47295    TRUE
66      0   3.0  20.0      0    120        0     1   1.00 0.67 72.80179   FALSE
67     70   1.0   9.0     15     40       25     2   1.00 0.75 31.23005   FALSE
68    230   1.0  16.0      3     55       25     1   1.00 1.00 53.13132   FALSE
69     15   3.0  15.0      5     90       25     2   1.00 1.00 59.36399   FALSE
70    200   0.0  21.0      3     35      100     3   1.00 1.00 38.83975   FALSE
71    190   4.0  15.0     14    230      100     3   1.50 1.00 28.59278    TRUE
72    200   3.0  16.0      3    110      100     3   1.00 1.00 46.65884   FALSE
73    250   0.0  21.0      3     60       25     3   1.00 0.75 39.10617   FALSE
74    140   0.0  13.0     12     25       25     2   1.00 1.00 27.75330   FALSE
75    230   3.0  17.0      3    115       25     1   1.00 0.67 49.78744   FALSE
76    200   3.0  17.0      3    110       25     1   1.00 1.00 51.59219   FALSE
77    200   1.0  16.0      8     60       25     1   1.00 0.75 36.18756   FALSE

Vectors

If you choose to work with vectors, it is important to remember how to index a vector’s contents!

  • You can grab the elements out of a vector with [] – read “where”
toung_twister3[c(3, 6, 10, 13)]
[1] "wood"      "woodchuck" "woodchuck" "wood?"    
  • To replace or change those elements, use the assignment arrow (<-) on the indexed positions!
toung_twister3[c(3, 6, 10, 13)] <- "WOOD!"
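
The two steps above can be tried end to end with a small sketch. The vector twister_words below is a hypothetical stand-in built for illustration (it is not the toung_twister3 object from the slides), and the helper names wood_positions, extracted, and n_replaced are likewise assumptions.

```r
# Hypothetical vector for illustration (not the slides' toung_twister3 object)
twister_words <- c("How", "much", "wood", "could", "a", "woodchuck", "chuck",
                   "if", "a", "woodchuck", "could", "chuck", "wood?")

# Positions of the elements we want to work with
wood_positions <- c(3, 6, 10, 13)

# Extract: [] returns a new character vector; the original is unchanged
extracted <- twister_words[wood_positions]

# Replace: the single value "WOOD!" is recycled across all four positions
twister_words[wood_positions] <- "WOOD!"

# A logical vector can index (or count) the same elements
n_replaced <- sum(twister_words == "WOOD!")
```

Note that extraction and replacement use the same [] syntax; only the side of the assignment arrow differs. Elements outside the indexed positions keep their original values.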